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Abstract — Deregulation of energy markets, penetration of 
renewables, advanced metering capabilities, and the urge for 
situational awareness, all call for system-wide power system 
state estimation (PSSE). Implementing a centralized estimator 
though is practically infeasible due to the complexity scale of 
an interconnection, the communication bottleneck in real-time 
monitoring, regional disclosure policies, and reliability issues. 
In this context, distributed PSSE methods are treated here 
under a unified and systematic framework. A novel algorithm 
is developed based on the alternating direction method of 
multipliers. It leverages existing PSSE solvers, respects privacy 
policies, exhibits low communication load, and its convergence 
to the centralized estimates is guaranteed even in the absence of 
local observability. Beyond the conventional least-squares based 
PSSE, the decentralized framework accommodates a robust state 
estimator. By exploiting interesting links to the compressive 
sampling advances, the latter jointly estimates the state and 
identifies corrupted measurements. The novel algorithms are 
numerically evaluated using the IEEE 14-, 118-bus, and a 4,200- 
bus benchmarks. Simulations demonstrate that the attainable 
accuracy can be reached within a few inter-area exchanges, while 
largest residual tests are outperformed. 

Index Terms — Alternating direction method of multipliers; bad 
data identification; Huber's function; phasor measurement units; 
SCADA measurements; multi-area state estimation. 

I. Introduction 

Power system state estimation (PSSE) has been traditionally 
performed at regional control centers with limited interaction. 
However, due to the deregulation of energy markets, large 
amounts of power are transferred over high-rate, long-distance 
lines spanning several control areas ifTTI . These so-called tie 
lines, originally constructed for emergency situations, are now 
fully operational and must be accurately monitored. Since 
any control area can be strongly affected by events and 
decisions elsewhere, independent system operators (ISOs) can 
no longer operate in a truly independent fashion. The ongoing 
penetration of renewable sources further intensifies inter-area 
power transfers, while their intermittent nature necessitates 
more frequent state acquisition. At the same time, the advances 
in metering infrastructure are unprecedented: phasor mea- 
surement units (PMUs) provide finely-sampled voltage and 
current phasors, synchronized across the grid; smart meters 
reach the distribution level; and networked processors are 
being installed throughout the grid J6), IfTTI . The abundance 
and diversity of measurements offer advanced monitoring 
capabilities, but processing them constitutes a major challenge, 
which is exacerbated in the presence of malicious data attacks 
and bad data EH, UJ Ch. 5-6]. 

There are two key issues in modernizing the power grid 
monitoring infrastructure: Firstly, PSSE should be performed 
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at the interconnection level. Yet an interconnection may in- 
clude thousands of buses, while 2-3 measurements per state 
are typically needed. Requiring also real-time processing along 
with resilience to corrupted data render centralized state es- 
timation computationally formidable. Further, a centralized 
approach is vulnerable and is not flexible when it comes to 
policy and privacy issues. Secondly, decentralizing information 
processing for the power grid can be performed at several 
hierarchies Jl 1 ]: PMU measurements can be processed by pha- 
sor data concentrators (PDCs) lf26l : conventional supervisory 
control and data acquisition (SCADA) measurements together 
with PDC fused data can be aggregated by the ISO; and finally, 
estimates from ISOs can be merged at the interconnection 
level. These considerations corroborate that distributed PSSE 
and bad data analysis are essential for realizing the smart grid 
vision. 

Existing distributed methods for PSSE and bad data analysis 
are reviewed in Section [II] The PSSE problem, its unique 
requirements and challenges are highlighted in Section [III] In 
Section[lV] a new distributed PSSE methodology is developed. 
Based on the alternating direction method of multipliers (0, 
a systematic cooperation between local control centers is en- 
abled with unique features: it facilitates several practical PSSE 
formulations; it lowers the overhead for inter-area information 
exchanges; its convergence is guaranteed regardless of local 
observability or parameter tuning; and the resultant algorithm 
can be executed by solvers already in use at local control 
centers. Building on this framework, a robust decentralized 
estimator is derived in Section [V] Different from the con- 
ventional two-step bad data analysis, the novel approach im- 
plements Huber's M-estimator |fl~) in a decentralized manner, 
while PSSE is accomplished jointly with bad data removal. 
Leveraging sparsity of the introduced bad data vectors, the new 
algorithm augments standard PSSE solvers by a few iterations. 
The novel robust decentralized algorithms are numerically 
evaluated in Section lVTI and the paper is wrapped up in Section 
IVIII Regarding notation, lower- (upper-) case boldface letters 
denote column vectors (matrices), and calligraphic letters stand 
for sets. The notation (-) T denotes transposition, while := 
defines a symbol variable. 

II. Related Work 

Distributed solutions were pursued since the statistical for- 
mulation of PSSE 1271 . when it was realized that for a chain 
of serially interconnected areas, Kalman filter-type updates 
can be invoked incrementally in space 11271 Part III]. For 
arbitrarily connected areas though, a two-level approach with a 
global coordinator is required lf2"7l . Several renditions of this 
hierarchical approach can be found in J5], iTOI , J3T], IfTTI . 
ifTSTl . Most of these presume local observability, meaning that 
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local states estimated excluding boundary bus measurements 
are uniquely identifiable. Such an assumption may not hold 
due to bad data removal or because PSSE is performed at a 
lower than the ISO level. The need for a coordinator hinders 
the system's reliability, while the sought algorithms may be 
infeasible due to computational, communication, or policy 
limitations. 

Decentralized PSSE solutions include block Jacobi iter- 
ations |fl8l , (4), and an approximate algorithm developed 
from the optimality conditions involved ifTUl . However, these 
methods assume again local observability and convergence 
is not always guaranteed. The auxiliary problem principle 
is used in JH), but several parameters must be tuned. Local 
observability is waived in 11291 . where each area is envisioned 
to maintain a copy of the entire high-dimensional state vector. 
A first-order algorithm is proposed, yet its linear convergence 
scales unfavorably with the interconnection size. For a review 
on multi-area PSSE, see also [12J. 

Grossly corrupted SCADA data can potentially deteriorate 
PSSE results. Hence, these meter readings (a.k.a. bad data) 
should be identified or at least detected in a measurement set. 
Statistical tests, such as the x 2 - and the largest normalized 
residual tests, are typically employed for bad data detection 
and identification, respectively (25). Both tests rely on the 
LS-estimated residuals and can thus run only after PSSE has 
been completed. Whenever a bad datum has been identified, 
PSSE must rerun by ignoring that datum. Alternatively, robust 
estimators, such as the least-absolute deviation, the least 
median of squares, or Huber's estimator have been considered 
l3l , |9l , 11221 . |fl~). Towards a multi-area setup, most existing 
distributed PSSE methods rely on the two aforementioned 
tests. Even though metering reliability is improved in the smart 
grid realm, bad data analysis is still a major concern especially 
in the face of malicious data attack threats. Stealth attacks in 
power meter infrastructure are studied in [fl9l , [1T61 . In the 
absence of such attacks, £i-norm based methods have been 
developed in fl6l. 1301. and 171. 

III. Problem Formulation and Preliminaries 

Consider an interconnected power system consisting of K 
control areas, where each area comprises a subset of buses 
supervised by its own control center. The latter is able to 

(i) collect the electrical measurements recorded at area buses; 

(ii) reliably communicate with neighboring control centers; 
and (iii) carry out minimal computational tasks, such as 
solving a (non)linear least-squares (LS) problem. As usual, 
measurements are assumed to be time-synchronized within 
and across control area footprints. A control area here is not 
confined to an ISO region, but it can also model entities 
residing at lower grid levels, such as a substation or a PDC 
ll26l . Alternatively, a control area can be the local balancing 
area under a regional balancing authority's footprint; or under 
a micro-grid setup, a control area may degenerate even to a 
single bus. 

Suppose that Mk measurements aggregated at the fc-th area 
are concatenated in € K Mfc , and obey the model 

z fc = fyfe(x fe ) + w fc (1) 




Fig. 1. The IEEE 14-bus system partitioned into four areas l28l fEl . Dotted 
lassos show the buses belonging to area state vectors x^. PMU bus voltage 
(line current) measurements depicted by green circles (blue squares). 

where x& <G R^* contains the subset of the interconnected 
power system states involved in Zfc.; hk is a vector of Mfc 
functions; and Wfc denotes a disturbance term capturing 
measurement error and modeling inaccuracies. Error vectors 
{\Vfc}fc=r are assumed zero mean, having known covariance 
matrix £fc, and independent across areas. To simplify the 
presentation, the model of ((TJ can be premultiplied by S^ 1 ^ 2 
to yield 

z fe = /ife(x fc ) + w fc (2) 

where Zfc := S _1 / 2 Zfc and similarly for ft-(xfc) and w&. Hence, 
the noise term in (f2]l has identity covariance matrix. 

Functions {hk(xk)}k=i depend on the system's admittance 
matrix and are in general non-linear, except when PMUs are 
involved and complex quantities are expressed in rectangular 
representations. Performing state estimation with non-linear 
hk's entails solving non-convex optimization problems. Typ- 
ically, such models are iteratively linearized via the Gauss- 
Newton method, or by resorting to the so called DC approx- 
imation 11251 . H]. Either way, one arrives at the following 
computationally ubiquitous linear model (cf. (O) 

z fc = HfcXfc + w fe (3) 

where H fe e ^M h xN k is (Jacobian). When {hk}s 

are non-linear, (|3) is the linearized model per Gauss-Newton 
iteration. 

PSSE could be performed locally at each area. Specifically, 
area k could aim at solving 

min /fc(xfc;z fc ,Hfc) (4) 

where /&(•) is a convex function of Xfc for the model in (01; 
and the convex set Xk captures possible prior information, 
such as zero-injection buses, short circuits, or operational lim- 
its 11251 . 0]. Typically, fk is chosen equal to ^||zfc — HfcXfcjj 2 ,. 
For this choice, the minimizer of (0) is the LS estimate (LSE), 
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which yields the maximum-likelihood estimate (MLE) of X& 
if Wfc is Gaussian. To derive other MLEs and/or facilitate bad 
data removal, alternative forms of fk can be employed; cf. 
Section [V] For notational simplicity, the dependence of fk on 
Zfc and Hfc will be henceforth dropped. 

One of the unique PSSE characteristics in interconnected 
areas is that the local state vectors {xfc}^L 1 overlap partially 
(cf. the toy interconnection of Fig. [TJ. Supposing that both 
PMU data (bus voltage and line current measurements) and 
interconnection states (bus voltages) are expressed in rect- 
angular coordinates, the linear model of Q is exact. Area 
2 supervises buses {3,4,7,8}. But since it collects current 
readings on lines (7, 9) and (4, 5), its state vector X2 extends 
to the bus voltages of {5,9} as well. Thus, area 2 shares the 
states of bus 5 (9) with area 1 (4). Notationally, let the N x 1 
vector x collect all the states. For every two neighboring areas, 
say k and I, identify the intersection of their states as Ski- Let 
also Xfc[Z] (x;[fc]) be the sub-vector of x^ (x;) consisting of 
their overlapping variables ordered as they appear in x. For 
example, X3[4] = X4[3] contain the bus voltages of {11, 14}. 

Solving the K problems of the form (|4]i in isolation is 
clearly suboptimal, let alone the fact that control areas may be 
locally unobservable even if external states and their associated 
measurements are ignored. Disagreement on boundary bus 
estimates over critical tie lines is another important limitation 
of solving (01) on a per-area basis. On the other hand, upon 
defining X := {x : x^ G Xk Vfc}, jointly optimizing 
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at a central control center waives all these concerns and can 
considerably improve estimation accuracy. Yet this comes at 
the expense of impractical computational and communication 
load, increased vulnerability, and disclosure of internal sys- 
tem structure. Targeting the "sweet spot" between these two 
extremes, a decentralized method is proposed next. 

IV. Decentralized PSSE 

Tying the local tasks of © into a single optimization 
problem equivalent to (0 can be accomplished by 

K 
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s.t. x fe [Z]=x ; [fc], VZeA4, Vfc 

where Afk is the set of areas sharing states with area k. 

The constraints of © force neighboring areas to consent 
on their shared variables, which renders problems d§] and 
(0 equivalent. But the same constraints couple the estimation 
tasks across areas. To enable a truly decentralized solution, an 
auxiliary variable denoted by x^ G Rl 5fc! is introduced per 
pair of interacting areas k,l. To keep the notation uncluttered, 
symbols x.ki and x^ are used interchangeably for the same 
variable. Then, © can be alternatively expressed as 

K 

X)/*(xfc) (7) 
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{x fe e* fc },{x fci } 



fc=i 



s.t. Xfc[Z] = Xfcj, for all I G Afk, k= 1, . . . ,K. 



The novelty here is solving © using the alternating di- 
rection method of multipliers (ADMM), a method that has 
been successfully applied for distributing several optimization 
problems; see e.g., J2] for a review. In ADMM, Lagrange 
multipliers v^.; G Rl 5fc! l are introduced for each constraint 
of (O. Observe that Vk,i and vi t k correspond to the distinct 
constraints Xfc[Z] = x.ki and x;[fc] = x^, respectively. ADMM 
then exploits the method of multipliers concatenated with an 
iteration of the Gauss-Seidel algorithm. Specifically for ©, 
one first defines the augmented Lagrangian function 



L({x fe },{x fc; };{v fe j}) := 



(8) 
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where c > is a predefined constant. Letting r denote the 
iteration index, ADMM cycles through three steps: 



K +1 } := arg mm L ({x,}, Hlh H,i}) 
K+ 1 } := arg min L ({x* +1 }, {x M }; {v* ,}) 



{Xfc(} 



t+1 

'k,l 



'k,l 



,t+ir 



x^l 1 ) for all k,l. 



(9a) 
(9b) 
(9c) 



At step d9ab . {xfe}s are updated by minimizing the augmented 
Lagrangian while keeping x&/ and Vk,i fixed to their previous 
iteration values; x^; and v^.; can be initialized to zero. 
Likewise, x^; are updated in ( |9bl . Finally, ( |9cT > is a gradient 
ascent of L ({x^. +1 }, {x^ 1 }; {vk,l}) with step size c. 

Inheriting ADMM features, the minimization in d9ab decou- 
ples over control areas. Moreover, by exploiting the problem 
structure, the iterations (|9j can be greatly simplified as pre- 
sented next and detailed in the Appendix. 

Proposition 1. The steps in (O yield the same x| iterates as 
the following steps 
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(10a) 
(10b) 



(10c) 



where Xk(i) is the i-th entry o/xfc/ the set Afj: consists of the 
areas sharing the variable Xk(i) with area k; and xi[i] denotes 
the entry o/x; corresponding to Xk(i) defined for all I G Af^- 
Regarding initialization, state variables x^ are set to arbitrary 
values x°; variables p^i) are initialized to (x^(i) + s^(i))/2; 
and Sk(i) as in ( II Obi ). 

The minimization in (II Oat and the simple update of dlOcb 
are performed at the local centers. The averaging step of (llObb 
is accomplished either through a coordinator, or locally too. 
Either way, the information revealed per area k is minimal. No 
measurements or regression matrices, but only the boundary 
bus states need to be exchanged, and only between the 
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interested neighboring areas. States can be initialized to the 
flat profile ll25l . or some prior estimate. To implement the 
LSE given a non-linear measurement model [cf. (2)], the 
derived algorithm should be used per Gauss-Newton iteration. 
At each Gauss-Newton iteration, an approximate linear model 
is updated locally, and the decentralized iterates are initialized 
using the latest state estimates. 

Supposing {/fc(x fc )}^L 1 are convex, and {X k }£ =1 are 
closed convex sets, the cost in (0 evaluated at {xjj.} generated 
by ( [Tol l converges under mild conditions (typically met in 
practice) to the optimal value of (0 [2, p. 17]. Hence, when 
the power system is globally observable, i.e., the centralized 
LSE is unique, ADMM iterates minimizing the LS cost 
converge to it. Notwithstanding, observability is not necessary 
for the method to converge: if the system is unobservable, 
ADMM iterates converge to one of the multiple LSEs. The 
equivalence to the centralized LSE ensures that global ob- 
servability analysis, assumed given in this work, carries over 
to the decentralized approach too. In addition, ADMM iterates 
have been shown to be resilient to asynchronous updates and 
random failures in the inter-area communication links ll32l . 

For notational convenience, define per area k a diagonal 
matrix Dj. with (£, i)-th entry \Afl\. Recall that by definition, 
is zero for strictly local states. Also, define the JV/t- 
dimensional vector pt with i-th entry the p\,{i) of dlOcb when 
| > 0, and otherwise. Hence, the second term in the cost 

of ( 11 Oat is expressed as §||D^ (xfc— p^.)|||. For the typical case 



of the unconstrained LSE, the minimizer of (II Oat is clearly 
given in closed form by 



• r t+i 



(H^H fc 



(Hjz fc + cDfcp*.) (11) 



which is a simple yet systematic modification of the local LSE. 
Existing PSSE software can be straightforwardly exploited 
for finding (fTTT i by simply adding yfcD^pl, as pseudomea- 
surements with diagonal loading matrix ^fcD\! 2 '. Note that 
pseudomeasurements are actually added only for the shared 
states. As empirically observed in Section |VI] the procedure 
terminates after a few tens of iterations. 



V. Decentralized Bad Data Analysis 

Time skews, instrument/communication failures, infrequent 
instrument calibration, reverse wiring, and parameter uncer- 
tainty can yield grossly corrupted SCADA measurements. In 
the cyber-physical smart grid context, bad data are not simply 
unintentional metering faults, but can also take the form of 
malicious data injections 11241 . If an intruder can counterfeit 
some meters so that the attack vector lies in the range space 
of the PSSE regression matrix, the attack is undetectable and 
can arbitrarily perturb state estimates Ifl9l , lfl6l . Excluding 
these naturally termed stealth attacks, this section focuses 
on bad data identification. After presenting an outlier-aware 
estimator, interesting connections are established, to efficiently 
implement it using the decentralized approach of Section ITVl 

A. Interconnection-Wide Bad Data Identification 

The local quantities {z/,., Hj., Wfc}^ =1 in can be 
vertically stacked in z, H, and w, respectively. The 



interconnection-level model then reads z = Hx + w, where 
the dimensions of z and x are M = X)fc=i an d -/V, in 
the order given. However, when bad data are present, a more 
pertinent model is 



z = Hx 



(12) 



where o is an unknown vector with its entry o(i) being non- 
zero only if z(i) is a bad datum lfT4l . [16], Q. Recovering 
both x and o essentially reveals the state and identifies 
faulty measurements. Such a mission however seems rather 
impossible, since the model in (fT2l is unobservable even if 
H is full column rank. By capitalizing on the sparsity of o 
though, interesting results can be obtained lfl4l . If tq bad data 
are expected, then one would ideally wish to solve 
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o|| < t 



(13) 



But the ^o-pseudonorm, i.e., the number of non-zero o(i)'s, 
renders ([T3l NP-hard. The problem is computationally in- 
tractable even for moderate-sized interconnections and small 
To. Building on the premise of compressed sensing, a prac- 
tical robust estimator can be derived after relaxing the £q- 
pseudonorm by the convex ^i-norm as (see also lfT4l ) 

1 
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(14) 



for a selected constant T\ > 0, or in the Lagrangian form 
1 



(x, 6) <G arg min 

x£^,o 2 



z — Hx — o L + A o 



(15) 



where A denotes a positive parameter. The optimization prob- 
lem in (Q0 is a convex quadratic program and can be solved by 
interior point-based methods. The estimator of ([TBI allows for 
joint state estimation and bad data identification. Even when 
some measurements are deemed as corrupted, their effect has 
been already suppressed, and the state estimate remains valid. 

B. Interesting Links 

The two statistical tests traditionally used for bad data 
analysis rely on the model z = Hx + w, and the residual 
error achieved by the unconstrained LSE. The latter can be 
expressed as r := Pz = Pw, where P := I-H(H T H)^ 1 H T 
is the so called "residual sensitivity matrix" satisfying P = P 2 
IU Ch. 5]. Clearly, when w is Gaussian, r is Gaussian too 
with covariance matrix P. The x 2 -test compares ||r||2 against 
a threshold to detect the presence of bad data 11251 . (TJ. The 
largest normalized residual (LNR) test computes 

rmax := max (16) 
ie{i,...,M) y/P(i,i) 

where P(i, i) is the (i, i)-th entry of P; note that in the absence 
of critical measurements, < P(i, i) < 1 for all i. Metric f max 
is then compared to a prescribed threshold to identify a single 
bad datum jT] Sec. 5.7]. Adopting the proof in [16, Prop. 1], 
the following claim can be established. 

Proposition 2. The optimization problem in d 1 3b with t = 1 
over X = WL N , and the LNR test of H6\ are equivalent in 
terms of the measurement being detected as bad. 
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For multiple bad data, this connection is unclear. If a 
measurement is deemed as outlier, PSSE is repeated after 
discarding this bad datum, the LNR test is re-applied, and 
the process iterates till no corrupted data are identified. Even 
though rank-one updates can be used to speed up computa- 
tions, the process becomes complicated for multi-area grids. 

Returning to the convex relaxation (flSt for X = M. N , note 
that when A->oo, the minimizer 6 becomes zero, and thus, x 
reduces to the LSE. On the contrary, by letting A — > + 021, 
the solution x coincides with argmin x ||z — Hx||i, meaning 
the least-absolute value estimator Q, J9). 

For finite A > 0, x of ( fT3] l corresponds to Huber's M- 
estimator; see (14\ and references therein. Based on this 
connection and assuming Gaussian w, tuning parameter A can 
be set to 1.34, which makes the estimator 95% asymptotically 
efficient for bad data-free measurements lETI p. 26]. 

Alternatively, Huber's estimate can be expressed as the 
x-minimizer of mm xe x.u> + ^ll z — Hx — as 

shown in |[20| . The bad data identification performance of this 
minimization is analyzed in 11301 . 

To solve (T5[ , one can first minimize over x and then over o. 
For X = R w , the x minimizing © is (H T H)~ 1 H T (z - o), 
and thus, minimizing ( TT3T > reduces to ifTBI . J7) 

mm i||P(z-o)||2 + A||o|| 1 . (17) 

A minimization similar to (TTtT > is derived in ifTBI using a 
generalized likelihood ratio test. By assuming a Bayesian 
prior x ~ Af(0, lfl6l suggests solving ( fl7l ), but with P 
substituted by I — H(H T H + X~ 1 )~ 1 H T . In any case, matrix 
P couples the minimization over o across areas and compli- 
cates a decentralized implementation. An efficient distributed 
algorithm for solving ( fT3T > is developed next. 

C. Distributed Robust Algorithm 

Consider now the system-wide minimization in (fT3T > under 
the decentralized PSSE format of Section [IV] To this end, 
partition o into subvectors o^'s conforming to the partition of 
z into Zfc's, and define the local functions 

/fe(x fc ,o fc ) := - \\z k - H fc x fc - o k \\ 2 2 + A||o fc ||i. (18) 

Critically, notice that {0^} belong to a single area and there is 
no need for sharing them. Similar to the way (0 was obtained 
from Q, the decentralized equivalent of ( [TBI is 

K 

x c ^w in w x 5I/fe(xfe,o fc ) (19) 

{x k £X k },{o k },{x kl } f— ' 
k— 1 

s.t. Xfe[Z] = xm, for all I € Af k , k = 1, . . . ,K. 

Proceeding with the ADMM methodology, the augmented 
Lagrangian of (fT9l is similar to the one in © apart from 
/fe(xfc) being replaced by f k (x. k ,o k ). Having three instead of 
two primal variable sets offers two algorithmic alternatives: 
the additional variables can be jointly optimized either with 
{xfc}s in step ( |9al >, or with {xfc;}s in step ( |9bl . The second 
choice yields computational efficiency as described next. 

Two are the key observations here. First, that the augmented 
Lagrangian of the problem is separable with respect to {o^} 



Algorithm 1 Decentralized Robust PSSE (D-RPSSE) 
Require: {Hfc,Dfc,Zfc}, and positive c, A. 

1: Initialize {x k ,p k ,s k } as in ( fTOb ; and {o k } to zero. 

2: for i = 1,2,... do 

3: Every area updates its x^ +1 as in ( l20l . 

4: Neighboring areas exchange shared state variables. 

5: Every area updates its s k +1 via ( llObl ). 

6: Every area updates its p k +1 via ( llOcb . 

7: Every area updates its o£, +1 through (f2Tb-(f2Zb. 

8: end for 



and {xfc;}, and hence, the optimization required at the ADMM 
step j9bl decouples over the two variable sets. Second, the 
updated {o^ +1 } vectors appear only in ff.(xk,6i), and do 
not affect the x^ 1 and v^ 1 updates. Since the iterations of 
([Tot were derived from ((9) for generic {/fe(xfe)}, they can be 
readily extended to the robust PSSE case as follows. 

In step ( II Oat , {/^(xfc)} should be replaced by {/fe(xfe, oj.)}. 
For X k = M. Nk and using the notation introduced before (fTTT i. 
state variables can be updated in closed form as 

x 4 ^ 1 := (H^H fc + cDk)' 1 (H^(z fe -o* ) + cB k pl) . (20) 

Interestingly, the update of d20l > is the LSE of (fTTT i slightly 
modified: measurements z k have been substituted by the bad 
data-compensated measurements (z k — oj.). 

Steps (llObb - dlOcb are left untouched; guaranteeing the con- 
sent between shared states is unrelated to the local functions 
/fc(xfc). The variables {o^.} can be finally updated as the 
minimizers of f k (pc k +1 ,o k ) as required by the ADMM step 
(l9bl . Interestingly, the solution of the latter minimization is 
provided in closed form too as (cf. [2. p. 32]) 

o^:=[z fc -H fc xHt (2D 
where [x}~^ denotes the simple thresholding operator 

{x + A, x < —A 
0, \x\ < A (22) 

x — A, x > A 

understood entry-wise in (|2T1 >. Intuitively, for measurements 
with a small tentative absolute residual, the corresponding 
o k (i) becomes zero. Otherwise, the bad datum residual is 
artificially shrunk by A towards zero via a non-zero o k (i). 

The novel robust decentralized algorithm, called D-RPSSE 
hereafter, is tabulated as Alg.Q] Compared to the decentralized 
LSE of ([TOj-dTTJ, D-RPSSE maintains software compatibility 
too. On top of adding ^/cDj|/ 2 p! as pseudo-measurements, 
resilience to bad data comes by simply off-setting local 
measurements by and via the thresholding rule of (l22l . 
Robust state estimates and bad data identification are jointly 
acquired without repeating PSSE as required by the LNRT 

VI. Simulated Tests 

The developed decentralized state estimators are numeri- 
cally tested on an Intel Duo Core @ 2.2 GHz (4GB RAM) 
computer using MATLAB. Two power network benchmarks 
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TABLE I 

Average standard deviation per state 



Estimator 


IEEE 14-bus grid 


IEEE 118-bus grid 


Internal LSE 
Local LSE 
Global LSE 


3.4- io~ 3 
3.1 ■ icr 3 
1.0 ■ 10 -3 


4.1 • 10" 4 
4.0 • 10~ 4 

2.2 ■ 10 -4 



are initially considered, namely the IEEE 14- and the 118- 
bus systems; while later a 4,200-bus is generated based on the 
IEEE 14- and 300-bus systems |28|. Their admittance matrices 
and the underlying power system states are obtained using 
MATPOWER 1 33]. For all systems, the state vector contains 
the real and imaginary parts of all bus voltages. Measurements 
consist of PMU recordings on bus voltages and line currents, 
expressed in rectangular coordinates too. Measurement noise 
is simulated as independent zero-mean Gaussian with standard 
deviation per real component 0.01 and 0.02 for voltages and 
currents, respectively ||33l . 

For the IEEE 14-bus grid, PMU sites and types are shown 
in Fig.[T] 6 bus voltage and 17 line current meters yield a total 
of 46 measurements corresponding to a redundancy ratio of 
1.6 E3. For the IEEE 118-bus grid, PMU sites are selected 
uniformly at random: 77 bus voltage and 205 line current 
meters are utilized, yielding a redundancy ratio of 2.4. The 
IEEE 14-bus grid is partitioned into the 4 areas depicted in 
Fig.Q] while the IEEE 118-bus system is split into 3 areas as 
in E3 Fig. 4]. 

A reasonable question is whether interconnection-wide 
PSSE offers any improvement over local PSSE. To this end, 
three estimators are numerically compared: First, an estimator 
that uses only the measurements related to its own area, 
henceforth called "internal." Second, a "local" estimator which 
extends its state to boundary buses that can be reached via 
tie line measurements. Lastly, the interconnection-wide or 
"global" estimator. The average standard deviation per state is 
empirically computed over 100 Monte Carlo runs. Table Ulists 
the standard deviations for the two power grids. The IEEE 118- 
bus grid attains improved estimation accuracy due to its in- 
creased redundancy ratio. More importantly, the improvement 
of the local over the internal estimator is marginal, whereas 
the accuracy of the global estimator roughly doubles. This 
observation speaks for the importance of interconnection- wide 
PSSE even when local observability is guaranteed. 

A. Testing the Decentralized LSE 

States are initialized here to the flat profile. Even though 
iterations ( TTOb are guaranteed to converge to the optimal value 
of (f5]) for any c > 0, the value of c affects the convergence 
rate. After scaling the data to obey the model in Q, c is 
empirically set to 10 4 . Two performance metrics are adopted: 
the per area error to the centralized solution of (0, denoted 
by e\ c :=||x[. c ' 1 — x|, ||2/iVfc, and the per area error to the true 
underlying state defined as e\ :=||xfc— x^l^/A^. 

Fig. [2] depicts the e\ and the e\. curves obtained for the 
IEEE 14-bus network. The almost flat e\ curves shown at 
the top of the figure correspond to the decentralized algorithm 
of |29| whose step sizes have been optimized. Based on the 
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Fig. 2. Per area error curves ej, 's (bottom) and el 's (middle) for the 
decentralized LSE of the IEEE 14-bus system of Fig.[T] The almost flat curves 
(top) show the corresponding el error curves for the algorithm of H291 . 
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5 's (bottom) and el 's (top) for the 



Fig. 3. Per area error curves 

decentralized LSE of the IEEE 14-bus system without local observability. 



e\ c curves, the algorithm of (flOt converges to the centralized 
solution. Interestingly though, as indicated by the e\ curves, 
accuracy of around 10 -3 dictated by the measurements is 
reached in 10-15 iterations. On the other hand, the algorithm of 
[29 1 attains the same accuracy after around 10,000 iterations. 
Being a first-order method, the algorithm in ||29l incurs per 
iteration complexity much smaller than ([Tol l, yet it does not 
fully exploit the capabilities of local PSSE solvers. Moreover, 
the high number of iterations required translates to increased 
inter-area communication overhead. 

To evaluate the new algorithm in scenarios where local 
observability does not hold, the electric current measurement 
on line (6,11) is removed from the IEEE 14-bus measurement 
set (cf. Fig. Q3. Since the only measurement directly related 
to bus 11 is the current measurement on line (10, 11) and that 
is collected by control area 4, area 3 is locally unobservable. 
The error curves obtained and plotted in Fig. |3]verify that the 
developed method does not require local observability. 

Switching to the IEEE 118-bus benchmark, similar results 
are observed. As evidenced by the e\ c and e\ curves 
plotted in Fig. @] the decentralized solution attains the desired 
statistical accuracy within only 5-10 iterations. 
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Iteration 

Fig. 4. Per area error curves et 's (bottom) and e£ 's (top) for the 
decentralized LSE of the IEEE 118-bus system of 1231 Fig. 4]. 

TABLE II 

Mean-Square Estimation Error in the Presence of Bad Data 



Method 


GA-LSE 


LSE 


LNRT 


Huber's 


(SO) 


0.0278 


0.0278 


0.0286 


0.0281 


(SI) 


0.0313 


0.0318 


0.0331 


0.0322 


(S2) 


0.0336 


0.1431 


0.0404 


0.0390 


(S3) 


0.0367 


0.1434 


0.0407 


0.0390 



B. Testing the Decentralized Robust Estimator 

The centralized versions of bad data analysis methods are 
compared first. The IEEE 14-bus grid of Fig. [T]is considered 
under the following four scenarios. (SO): no bad data; (SI): 
bad data on line (4, 7); (S2): bad data on line current (4, 7) and 
bus voltage 5; and (S3): bad data on line current (4, 7), tie line 
current (10, 11), and bus voltage 5. In all scenarios, bad data 
are simulated by multiplying the real and imaginary parts of 
the actual measurement by 1.2. The performance metric here 
is the ^2-norm between the true state and the PSSE, which is 
averaged over 1,000 Monte Carlo runs. 

Table [EI] lists the results obtained by the four centralized 
algorithms tested: (a) an ideal yet practically infeasible genie- 
aided LSE (GS-LSE), which ignores the corrupted measure- 
ments; (b) the regular LSE; (c) the LNR test-based (LNRT) 
estimator with the test threshold set to 3.0 JT); and (d) Huber's 
estimator of O with A = 1.34 and X = R N . For (S0)-(S1), 
the estimators perform almost similarly. The few corrupted 
measurements in (S2)-(S3) can deteriorate LSE's performance, 
while Huber's estimator performs slightly better than LNRT. 
Computationally, Huber's estimator is implemented using it- 
erations (1201- (|2H for the interconnection-wide vectors x and 
o with c = 0. The algorithm is terminated when the £2 -norm 
between the two last state iterates becomes less than 10~ 4 . On 
the average and for all scenarios (S0)-(S3), Huber's estimator 
converges in 6-12 iterations and within 1.3 msec, while LNRT 
requires 2.6 recalculations of (TTBl in 1.5 msec. The computing 
times were also measured for the IEEE 118-bus grid without 
corrupted data. Interestingly, the average time on the IEEE 
1 18 -bus grid without corrupted data are 3.2 msec and 81 msec, 
respectively. Of course, efficient updates for LNRT can be 
devised, but their decentralized implementation is not obvious. 
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Fig. 5. Per area error curves ej, 's (bottom) and e l k 's (top) for the D- 
RPSSE algorithm on the IEEE 14-bus benchmark under (S3). 
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Fig. 6. Per area error curves ej. 's (bottom) and ej. 's (top) for the 
D-RPSSE algorithm on the IEEE 118-bus benchmark having 10% of the 
measurements corrupted. 



Focusing on decentralized implementation, the D-RPSSE 
algorithm (cf. Alg. [U is considered next. Technically, the x 
minimizing ( fl~5l > may not be unique. However, a sufficient 
condition for its uniqueness is provided in ifTT] Th. 3.6], and 
this condition was satisfied in all the remaining tests. D-RPSSE 
was first tested on the IEEE 14-bus grid under scenario (S3). 
The associated e\ c and e| curves are depicted in Fig. [5] 
The decentralized iterates approach the underlying state at an 
accuracy of 10~ 3 in 30 iterations in 12.1 msec. In comparison, 
the respective time for the LSE was 5 msec. Finally, for the 
IEEE 118-bus system, 10% of the measurements are corrupted 
in the way described earlier. The corresponding error curves 
are plotted in Fig. [6] The time needed to achieve a 10~ 3 — 10 -4 
accuracy (10 iterations) is 12.1 msec versus 3.7 msec for the 
related LSE. 

The decentralized algorithms were finally tested on a larger 
power network: a 4,200-bus power grid that was generated 
using the IEEE 14- and 300-bus grids as follows. Each one of 
the 300 buses of the latter is assumed to be a different area, and 
is replaced by a copy of the IEEE 14-bus grid. Additionally, 
every branch of the IEEE 300-bus grid is now an inter-area 
line whose terminal buses are randomly selected from the two 
incident to this line areas. Measurements, bad data, and c are 
selected as in the tests for the IEEE 118-bus grid. 
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Fig. 7. Average error curves J2k°=i e l c / 300 ( bottom ) and Efei°i e { / 300 
(top) for the LSE and D-RPSSE algorithms on a 4,200-bus grid. 



Fig. [7] shows the corresponding error curves that are now 
averaged over the 300 areas. Given bad data-free measure- 
ments, the decentralized LSE approaches the underlying state 
at an accuracy of 10~ 3 in approximately 10 iterations or 
6.2 msec; the centralized LSE finished in 93.4 msec. For 
a 10% of the measurements being bad, D-RPSSE yields 
an accuracy of 10 -3 in less than 20 ADMM iterations or 
15.2 msec; the centralized robust estimator needed 193.5 msec. 
The convergence to their unique centralized counterparts is 
illustrated by the bottom curves. These tests corroborate that 
(i) the decentralized algorithms are basically insensitive to 
the variation of c; and (ii) the convergence time for both 
the decentralized LSE and D-RPSSE scale favorably with the 
network size. It is worth mentioning that the reverse topology 
where each of the 14 nodes of the IEEE 14-bus grid is replaced 
by a IEEE 300-bus system was tested too. Under this second 
architecture, the algorithms converged even faster due to the 
looser area coupling. 

VII. Conclusions 

Distributed and robust state estimators have been treated 
here under a systematic manner. The proposed algorithms 
waive local observability requirements and maintain backward 
compatibility. With a few minimal data exchanges between 
neighboring areas, local control centers can acquire highly 
accurate estimates for the part of the interconnection they are 
responsible for, and simultaneously identify (un)intentionally 
corrupted data. The novel framework accommodates several 
important modifications of the PSSE problem, such as con- 
straints (e.g., zero-injection buses, operational limits) and 
different MLEs. Exciting issues that emerge for future research 
include this work's application to generalized state estimation, 
its applicability to non-convex problems, and re-weighted 
versions of <fT3T > lfl4l . 

Appendix 
A useful lemma is shown first. 

Lemma 1. For every pair of adjacent areas k and I, the 
Lagrange multipliers updated by d9cl > satisfy \j, ; + v* fe = 
per iteration r > 0. 



Proof: Note that step d9bb decouples over the x^'s, while 
the minimizers can be shown to be 



t+i 



x 



t+lr 



k] 



'k,l 



'Lk 



2c 



(23) 



Next, consider the updates of Vk,i and v/ ^ according to 
step d9cb . Adding the two updates by parts and solving for the 



common term x 



t+i 
hi 



, yields 



'k,l 



'Lk 



'k,l 



f l,k 



,t+l 



[k] 



Kl 2c 2c 2 

By equating the right-hand sides of (l23l and the last equation, 
the claim of the lemma follows readily. ■ 
Proof of Proposition {J} The optimization in d9al > is 
separable across areas. Upon completing the squares, the 
optimization task for area k during step ( |9at becomes 

2 



min/fc(x fe ^ 



-y 

2 ^ 



x fc [/]- 



(24) 



Apparently, the £2 -norms in (l24l decouple over the entries 
of the vectors involved. However, a single entry of x^, say 
Xk(i), may be shared not only between areas k and I, but 
rather among area k and all the areas in Af^. If Xki[i] (vk,i[i\) 
denotes the entry of x/y (vfe ;) corresponding to Xk(i), the 
optimization in d24T i can be expressed as 

min/ fe (x fe ) + ^ Mil : ' " 

where for all k, and i = 1, 



y fe +1 «y 



(25) 



, N k with Aft ^ 0, define 




(26) 



t+i 



By Lemma Q] step d9bD simplifies to 
0.5 (x£ +1 [Z] + x* +1 [A:]). In other words, the auxiliary 
variable x« is the average of the shared state variables across 
areas k and / per iteration. By eliminating the auxiliary 
variables x^; from the updates of d26i > and d9cl >, step d9bl can 
be dropped. Hence, one arrives at the iterates 



argmin/ fe (xfc) + - V \Nl\ (x k {i)-p\{i)y 



Vfc.i + c 



L [fc] 



t+lf -\ 



^( 4+1(l ' ) + Mig^ +1 « 



1_ V- v _ 



(27a) 
(27b) 



c 

(27c) 



To further simplify the iterations, define the average of the 
shared variable Xk(i)'s copies over Ml as the s|(i) in ( 1 10bb . 
Define also the average of the weighted Lagrange multipliers 
U !W := S;a^! v k i[^\l{ c Wk\)- Then, J27cl > can be written as 

Pl +1 W ■■= \ (4 +1 (<) + 4 +1 M) - v> k +1 (i)- (28) 
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With {\~k.i} initialized to zero, {uk(i)} can be recursively 



updated as u k 



t+Xi 



(i) :=uX{i) + {x^\i)-s^\i))/2. Hence, 
update (I28t can be alternatively performed as in (llOct . Collect- 
ing (I27al i. the definition of {s^,(i)}, and the recursive updates 
for {Pfc(i)}, one readily arrives at (Tint . ■ 
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